Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support CPU profiling sections of code #3971

Merged
merged 24 commits into from May 6, 2019
Merged

Conversation

breezewish
Copy link
Member

@breezewish breezewish commented Dec 24, 2018

Signed-off-by: Breezewish breezewish@pingcap.com

What have you changed? (mandatory)

We usually need to find out why specific sections of code are slow. However, sometimes it cannot be done via simple perf, i.e. when there is a bootstrap. This PR provides facility to programmatically start or stop CPU profiling. In this way, we can start profiling before our interested code begins and stop profiling after it ends.

We can explore more interesting usages in the future, i.e. toggle profiling dynamically via signals, or environment variables. Future PRs are welcome!

The manual is in the code. I paste it here for easy reading:


Profile a part of the code using CPU Profiler from gperftools or Callgrind.

Requirements

  1. Linux

    Other OS may also work however, not tested.

  2. gperftools

    You can follow its INSTALL manual.
    Roughly the instructions are the following:

    1. Download packages from release
    2. Run ./configure
    3. Run make install

Usage

profiler::start("./app.profile");
some_complex_code();
profiler::stop();

Then, compile the code with profiling feature enabled and run the code with environment
variable TIKV_PROFILE=1.

By default, a profile called app.profile will be generated by CPU Profiler.
You can then analyze the profile using pprof.

If the application is running in Callgrind, a Callgrind profile dump will be generated instead.
Notice that you should run Callgrind with command line option --instr-atstart=no, e.g.:

TIKV_PROFILE=1 valgrind --tool=callgrind --instr-atstart=no ./my_example

Also see examples/prime.rs.

Signed-off-by: Breezewish <breezewish@pingcap.com>
@siddontang
Copy link
Contributor

Do we need to install gperftools at first?

Signed-off-by: Breezewish <breezewish@pingcap.com>
@breezewish
Copy link
Member Author

@siddontang Yes. I copied the manual from code to the PR description now. I also updated the code to support Callgrind (so that we can profile cache hit etc).

Signed-off-by: Breezewish <breezewish@pingcap.com>
Signed-off-by: Breezewish <breezewish@pingcap.com>
@zhouqiang-cl

This comment has been minimized.

@siddontang
Copy link
Contributor

So if the user doesn't install perftools, can it build TiKV now?

I suggest you introduce valgrind in another PR.

@Connor1996 Connor1996 added the component/performance Component: Performance label Dec 25, 2018
@breezewish
Copy link
Member Author

breezewish commented Dec 25, 2018

@siddontang Yes. Only when profiling feature is enabled, gperftools is required. As you can see, it builds normally on Circle CI and our own Jenkins CI.

The support of Valgrind is only 4 lines of code. It should not be a big problem.

@siddontang
Copy link
Contributor

do we need to install vagrind manally too?

@siddontang
Copy link
Contributor

IMO, if we build with this feature, we don't need using environment forcibly to control it. Only calling start and stop is enough.

@breezewish
Copy link
Member Author

@siddontang Don't need to install valgrind to build it.

Ok

@breezewish
Copy link
Member Author

Updated. No need to set "TIKV_PROFILE=1" now.

Signed-off-by: Breezewish <breezewish@pingcap.com>
Signed-off-by: Breezewish <breezewish@pingcap.com>
Signed-off-by: Breezewish <breezewish@pingcap.com>
@breezewish
Copy link
Member Author

@brson PTAL for this~ Thanks a lot!

Signed-off-by: Breezewish <breezewish@pingcap.com>
@brson
Copy link
Contributor

brson commented Apr 12, 2019

This is very cool @breeswish.

I guess that the idea is to insert start and stop temporarily and then remove them when done? Otherwise there's a lot of deadlock potential.

One nice change would be to change the profiler manifest to make all the dependencies optional (particularly the profiling deps), and have the profiling target enable them. That would save the download, build, and link of the profiler crates.

Is there anything holding up landing this?

Maybe docs - like the allocator configuration this seems like something that should be surfaced in a developer's guide. Not necessary for this PR though.

Signed-off-by: Breezewish <breezewish@pingcap.com>
@breezewish
Copy link
Member Author

breezewish commented Apr 12, 2019

@brson @kennytm Thanks a lot for reviewing! I have updated my Cargo.toml to make these dependencies optional (not sure it there are better ways). Also updated docs for MacOS instructions.

I guess that the idea is to insert start and stop temporarily and then remove them when done? Otherwise there's a lot of deadlock potential.

Yes. A better way maybe, provide interfaces to trigger a start and stop it moment later so that a profile can be generated. This does not introduce runtime costs when profiling is not started. The interface can be our status server like #4444

Is there anything holding up landing this?

Nop, it's just lack of reviewers previously :)

@breezewish breezewish requested a review from brson April 12, 2019 09:17
@siddontang
Copy link
Contributor

@brson

I think we can advance this PR now.
But here I prefer using HTTP API to control profile.

@breezewish
Copy link
Member Author

@siddontang We can have another PR that utilize the interface via HTTP API.

@siddontang
Copy link
Contributor

PTAL @brson

@brson
Copy link
Contributor

brson commented Apr 26, 2019

@siddontang acknowledged that we want an HTTP interface, and that this is a good start.

brson
brson previously approved these changes Apr 26, 2019
@brson brson added the status/LGT1 Status: PR - There is already 1 approval label Apr 26, 2019
@brson
Copy link
Contributor

brson commented Apr 26, 2019

lgtm

Copy link
Member

@BusyJay BusyJay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can gperftools be installed automatically?

publish = false

[features]
profiling = ["lazy_static", "cpuprofiler", "callgrind", "valgrind_request"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can it compile on Windows without the feature enabled?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should work though I don't have a Windows machine. I changed it to [target.'cfg(linux)'.dependencies] and it can compile on my MacOS.

//! Also see `examples/prime.rs`.

#[allow(unused_extern_crates)]
extern crate tikv_alloc;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use tikv_alloc?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tikv_alloc generally needs to be linked into every crate that doesn't link to tikv, so that tests and benches of that crate use tikv's allocator. I don't think this extern crate statement is needed though as long as the dependency exists.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI will fail if alloc is not linked

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. There are tests that all binaries contain jemalloc.

components/profiler/src/lib.rs Outdated Show resolved Hide resolved
components/profiler/src/profiler_linux.rs Outdated Show resolved Hide resolved
[target.'cfg(unix)'.dependencies]
lazy_static = { version = "1.2.0", optional = true }
cpuprofiler = { version = "0.0.3", optional = true }
callgrind = { version = "1.1.0", optional = true }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is valgrind a useful use case? I think most of the time only cpuprofiling is used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

valgrind is very useful for micro benchmark, which can report precise amount of function calls, as well as precise (emulated) cache hit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also hoping that this crate can be extended into a general purpose profiling module that can be published for the community, and ultimately fulfill the various requirements of the Go profiling tools. More profiler options in that case seems good.

Signed-off-by: Breezewish <breezewish@pingcap.com>
Signed-off-by: Breezewish <breezewish@pingcap.com>
Copy link
Contributor

@siddontang siddontang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@brson brson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@breezewish breezewish merged commit 68802bb into tikv:master May 6, 2019
@breezewish breezewish deleted the __profiler branch May 6, 2019 13:57
jswh pushed a commit to jswh/tikv that referenced this pull request May 27, 2019
Signed-off-by: Breezewish <breezewish@pingcap.com>
sticnarf pushed a commit to sticnarf/tikv that referenced this pull request Oct 27, 2019
Signed-off-by: Breezewish <breezewish@pingcap.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/performance Component: Performance status/LGT1 Status: PR - There is already 1 approval
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants